Skip to content

Comments

add: partial copy method#75

Open
TOsmanov wants to merge 34 commits intomasterfrom
add-partial-copy
Open

add: partial copy method#75
TOsmanov wants to merge 34 commits intomasterfrom
add-partial-copy

Conversation

@TOsmanov
Copy link

@TOsmanov TOsmanov commented Mar 14, 2025

Workflows:

  • Tests and lint are separated

Base Foliant backend:

  • Added the partial_copy method, which copies files according to a given list
  • Added tests

CLI:

  • Added the prepare_list_of_files method, which prepares a list of files from a user-specified string

Utils:

  • Updated for compatibility with Python 3.9

Other

  • Python 3.9 is now the minimum required version
  • Added scripts for running tests and for running tests in Docker

@TOsmanov TOsmanov force-pushed the add-partial-copy branch 3 times, most recently from c595750 to 846bce4 Compare March 20, 2025 13:43
@TOsmanov TOsmanov force-pushed the add-partial-copy branch 2 times, most recently from a52146d to ab12f21 Compare March 21, 2025 08:51
@TOsmanov TOsmanov force-pushed the add-partial-copy branch 8 times, most recently from e4af58a to 1ebf3c5 Compare April 17, 2025 13:31
@codecov
Copy link

codecov bot commented May 14, 2025

Codecov Report

❌ Patch coverage is 80.95238% with 44 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.81%. Comparing base (9a39e1f) to head (15b616d).

Files with missing lines Patch % Lines
foliant/partial_copy.py 76.57% 41 Missing ⚠️
foliant/cli/make.py 90.47% 2 Missing ⚠️
foliant/backends/base.py 92.30% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #75      +/-   ##
==========================================
+ Coverage   76.88%   77.81%   +0.92%     
==========================================
  Files          14       15       +1     
  Lines         398      613     +215     
==========================================
+ Hits          306      477     +171     
- Misses         92      136      +44     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@TOsmanov TOsmanov force-pushed the add-partial-copy branch 2 times, most recently from 695c683 to ff834f7 Compare May 16, 2025 08:43
@TOsmanov TOsmanov force-pushed the add-partial-copy branch from f3e2907 to 480df1f Compare May 22, 2025 10:12
@TOsmanov TOsmanov marked this pull request as ready for review October 14, 2025 11:32
@TOsmanov TOsmanov requested a review from holamgadol October 14, 2025 11:32
@TOsmanov TOsmanov marked this pull request as draft October 17, 2025 05:35
@TOsmanov TOsmanov marked this pull request as ready for review October 17, 2025 06:21
Comment on lines +253 to +254
lines = post.content.strip().splitlines()
new_content = lines[0] if lines else ''

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness: In _process_content, the first line of content is now always preserved when remove_content is True, even if keep_first_header is False. This makes the flag ineffective and results in unintended content retention. Initialize new_content to an empty string and only assign lines[0] if keep_first_header is True.

🤖 AI Agent Prompt for Cursor/Windsurf

📋 Copy this prompt to your AI coding assistant (Cursor, Windsurf, etc.) to get help fixing this issue

File: foliant/partial_copy.py. Lines 253-259. Issue: keep_first_header=False no longer removes content because the first line is always preserved. Fix by only retaining the first line when keep_first_header is true, and handle header concatenation safely when new_content is empty. Apply the provided diff.

Comment on lines +1 to +6
import os
import re
from pathlib import Path
from shutil import copy
from typing import Union, List, Set
import frontmatter

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate Code: ⚠️ Duplicate Code Detected (Similarity: 81%)

This function extract_first_header duplicates existing code.

📍 Original Location:

foliant/preprocessors/_unescape.py:21-39

Function: process_escaped_tags

💡 Recommendation:
Consider consolidating into a unified content processing utility class or module. Both functions could be methods in a MarkdownContentProcessor class that handles various markdown text operations (extraction, transformation, parsing). The extract_first_header method could use a similar compiled pattern approach as process_escaped_tags for consistency and performance. Alternatively, create a shared base function that handles regex pattern matching on markdown content, with parameters to control whether to extract (search/match) or transform (substitute). This would reduce code duplication in the regex pattern handling and content processing logic.

Consider importing and reusing the existing function instead of duplicating the logic.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the analysis, but this appears to be a false positive. The functions are not duplicate code:

  1. Different purpose:

    • process_escaped_tagstransformation (removes < from escaped tags)
    • _extract_first_headerextraction (finds and returns the first header)
  2. Different regex patterns:

    • self.pattern (in process_escaped_tags) — specific to escaped tags, defined in __init__
    • r'^(#{1,6})\s+(.+)$' (in _extract_first_header) — for finding Markdown headers
  3. Different signatures and context:

    • process_escaped_tags — instance method, uses self, includes logging
    • _extract_first_header — static utility method

The ~81% similarity is likely due to the shared high-level structure of "apply regex to string" — but this is the abstraction level of "a function that works with regexes," not duplication of business logic.

Merging these into one class would introduce unnecessary complexity and violate the Single Responsibility Principle. If any optimization is needed, it would be extracting regex compilation to module-level constants — not merging the functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant